Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: Dhanashri Adhav, Abhijit Shinde, Nana Modhale, Shreyas Sukale, Prof. Manisha Darak
DOI Link: https://doi.org/10.22214/ijraset.2023.53100
This paper focuses on objective video quality assessment (VQA) and proposes a deep learning-based approach to measure quality degradation. The study involves experiments on several social clouds (SCs) and low-quality videos. Selected videos are uploaded to the SCs to evaluate differences in video service and quality. The average over all videos, denoted Avg_100, is used to measure quality, while the peak signal-to-noise ratio (PSNR) is found to have no significant impact on the other indicators. By utilizing deep learning techniques, this research aims to provide optimal video quality and multimedia services that meet Quality of Service (QoS) standards and enhance the overall Quality of Experience (QoE) for users.
I. INTRODUCTION
The rise of social cloud platforms has significantly changed how video content is shared and consumed. To ensure user satisfaction and engagement, it is crucial to offer high-quality videos. However, factors such as network congestion, transcoding, and compression can degrade video quality on these platforms. Accurately measuring objective video quality is therefore essential to overcoming these challenges and improving the viewing experience. Deep learning algorithms offer a promising approach to assessing video quality due to their ability to extract meaningful features and learn complex patterns. This research paper proposes a novel method to measure objective video quality on social cloud platforms using deep learning techniques.
The approach aims to overcome the challenges associated with video quality assessment by leveraging the capabilities of deep neural networks. The research team developed a deep learning model that accounts for both low-level visual features and high-level semantic information in video data. By training the model on a rich video dataset collected from different social cloud platforms, the team aims to capture the complex nuances and differences in video quality across content types and platform-specific factors. To assess the model, the team performed extensive experiments on video datasets collected from popular social cloud platforms and compared its performance against state-of-the-art methods for objective video quality evaluation.
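The paper does not specify the network architecture. Purely as an illustration of the kind of model described above, the sketch below (assuming PyTorch; the layer sizes, the 224x224 input, and the single-score output are our assumptions, not the authors' design) shows a frame-level CNN that regresses a quality score from raw pixels:

```python
# Minimal sketch of a frame-level CNN quality regressor, assuming PyTorch.
# Layer sizes and input resolution are illustrative, not the authors' model.
import torch
import torch.nn as nn

class FrameQualityCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Low-level visual features: stacked convolutional blocks.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 224 -> 112
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 112 -> 56
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),               # global pooling -> 128-dim vector
        )
        # Regression head: collapse features into a single quality score.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: batch of frames, shape (N, 3, H, W); returns (N, 1) quality scores.
        return self.head(self.features(x))

model = FrameQualityCNN()
scores = model(torch.randn(4, 3, 224, 224))  # e.g. 4 frames -> 4 predicted scores
```

In practice such a model would be trained with a regression loss (e.g. MSE) against per-video quality labels, and per-frame predictions would be pooled into a video-level score.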
The results show that the proposed approach is effective in accurately measuring the mean squared error (MSE) and peak signal-to-noise ratio (PSNR) of video quality across different platforms. The work thus contributes a novel deep learning model for objectively measuring video quality on social cloud platforms, together with a comprehensive assessment of its performance against existing approaches.
A. Objectives
II. LITERATURE REVIEW
Laghari et al. [1] investigated the perception of Quality of Experience (QoE) in social cloud computing for image compression. An important aspect of this work is its evaluation of how users perceive cloud-based image services, with the goal of improving the usability of those services.
Kawano et al. [2], [3] present a new approach for assessing video streaming quality that does not require a reference video. This contributes to our understanding of how to evaluate streaming video quality without access to the original content.
Barkowsky and Masala [4] discuss objective video quality assessment and its application to developing enhanced models using large-scale video databases. A considerable part of this study is devoted to insights into the development of quality assessment models for video content.
Brunnström et al. [5] explore the development of a 2D no-reference video quality model and its relevance to 3D video transmission quality. Their study contributes to understanding video quality assessment in 3D video transmission.
Ciubotaru et al. [6] investigate objective assessment of adaptive multimedia streaming quality, with a focus on region-of-interest-aware streaming. Their work addresses the challenges of evaluating quality in adaptive streaming scenarios.
Varga [7], [8] proposes a no-reference method for assessing video quality based on the temporal pooling of deep features, demonstrating the application of deep learning techniques to video quality assessment.
Wang et al. [9] present a performance analysis of objective video quality metrics, evaluating different metrics to determine their effectiveness.
Zhou et al. [10] explore the objective assessment of region-of-interest-aware adaptive multimedia streaming quality, showing that adaptive streaming quality can be evaluated by taking regions of interest into account.
Giannopoulos et al. [11] investigated the use of convolutional neural networks for video quality assessment with a supervised learning approach, exploring how deep learning techniques can be applied to this task.
Overall, these studies contribute to a deeper understanding of video quality assessment beyond traditional approaches, covering QoE evaluations, no-reference models, deep learning methods, and adaptive streaming scenarios. This review of existing work sets the foundation for the proposed research on measuring objective video quality in the social cloud using deep learning.
A. Proposed Work
To examine video quality performance in social clouds, original videos were first selected and downloaded. These originals varied in resolution but were all in .mp4 format. Before uploading them to the SCs, the technical parameters of the original videos were recorded. Upon uploading, the SCs automatically compressed the videos according to their predefined algorithms, reducing storage size. Video quality was then evaluated by comparing each original video with the corresponding compressed version.
The proposed design measures video quality using the relevant technical parameters. The peak signal-to-noise ratio (PSNR), a metric widely used in signal processing, was employed. PSNR expresses the ratio between the maximum possible power of a signal and the power of the noise. The average PSNR (Avg) value is commonly used in video processing to assess video quality and lossy compression codecs. Higher PSNR values indicate better reconstructed video quality. In video compression, a PSNR between 30 dB and 50 dB at 8-bit depth is considered good, with values closer to 50 dB preferable. It is important to note that PSNR comparisons are valid only between results produced with the same codec, codec type, and corresponding video content.
Although PSNR is a simple logarithmic function of MSE (mean squared error), it is easier to interpret because of its decibel scale. PSNR is therefore particularly useful for evaluating video codec standards such as MPEG-4, H.263, and H.264.
B. Algorithm
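The paper does not spell the procedure out as pseudocode. The sketch below is our illustrative reconstruction of the frame-by-frame comparison described above, assuming OpenCV (cv2) is available and that the original video and its SC-compressed download have matching resolutions; the function name and file paths are hypothetical:

```python
import cv2
import numpy as np

def average_psnr(original_path: str, compressed_path: str,
                 max_val: float = 255.0) -> float:
    """Average per-frame PSNR (dB) between an original video and its SC-compressed copy."""
    orig = cv2.VideoCapture(original_path)
    comp = cv2.VideoCapture(compressed_path)
    psnrs = []
    while True:
        ok1, f1 = orig.read()
        ok2, f2 = comp.read()
        if not (ok1 and ok2):
            break                                  # stop at the shorter stream
        err = np.mean((f1.astype(np.float64) - f2.astype(np.float64)) ** 2)
        if err > 0:                                # skip identical frames (infinite PSNR)
            psnrs.append(10.0 * np.log10(max_val ** 2 / err))
    orig.release()
    comp.release()
    return float(np.mean(psnrs)) if psnrs else float("inf")

# e.g. avg = average_psnr("original.mp4", "downloaded_from_sc.mp4")
```

Averaging the per-frame PSNR in this way corresponds to the Avg PSNR value discussed above.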
III. MATHEMATICAL FORMULA
A. PSNR (Peak Signal-to-Noise Ratio)
PSNR is calculated as the ratio of the maximum possible power of a signal (e.g., the maximum pixel value) to the mean squared error (MSE) between the original video and the reconstructed video:
PSNR = 10 * log10((MAX^2) / MSE)
where:
- MAX is the maximum pixel value (e.g., 255 for 8-bit videos).
- MSE is the mean squared error calculated between the original and reconstructed video frames.
PSNR is usually expressed in decibels (dB), and higher PSNR values indicate better video quality.
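For reference, a direct translation of this formula into NumPy, taking a precomputed MSE (defined in the next subsection) as input; the function name and the handling of the MSE = 0 case are our choices:

```python
import numpy as np

def psnr(mse_value: float, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB from a precomputed MSE; MAX defaults to 255 for 8-bit video."""
    if mse_value == 0:
        return float("inf")  # identical frames: PSNR is unbounded
    return 10.0 * np.log10((max_val ** 2) / mse_value)
```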
B. MSE (Mean Squared Error)
MSE is calculated as the average of the squared pixel-wise differences between the original video and the reconstructed video:
MSE = (1 / (M * N)) * Σ_i Σ_j (X(i, j) - Y(i, j))^2
where:
- M and N are the dimensions of the video frames.
- X(i, j) and Y(i, j) are the corresponding pixel values of the original and reconstructed video frames.
MSE provides a measure of the average error between the original and reconstructed video frames. Smaller MSE values indicate better video quality.
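A direct NumPy implementation of this formula, assuming two equally sized 8-bit frames (the function name is ours):

```python
import numpy as np

def mse(x: np.ndarray, y: np.ndarray) -> float:
    """Mean squared error between two equally sized frames X and Y."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    return float(np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2))
```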
These formulas are commonly used to assess the objective quality of reconstructed videos against their original counterparts. Combining PSNR and MSE with CNN-based quality prediction can improve the accuracy of the quality measurements.
IV. RESULT AND DISCUSSION
The use of CNN models enabled precise prediction of objective quality measures by capturing the spatial and temporal characteristics of video data. Incorporating the PSNR and MSE metrics improved the robustness of the assessments. The purpose of this approach is to provide a better user experience and to improve streaming services by assessing video quality in social clouds. Further testing across a variety of datasets and platforms, together with additional research addressing specific scenarios, is required to enhance the generalizability of these results.
V. FUTURE SCOPE
There are still opportunities for improvement, such as exploring more sophisticated CNN architectures and integrating hybrid approaches to enhance accuracy and efficiency.
VI. CONCLUSION
This paper has explored the utilization of Convolutional Neural Networks (CNN) and metrics such as Peak Signal-to-Noise Ratio (PSNR) and Mean Squared Error (MSE) in the measurement of objective video quality. The CNN architecture's capability to capture spatial and temporal features has enabled accurate prediction of quality measures. The inclusion of PSNR and MSE as evaluation metrics has further strengthened the reliability of the assessments. It is evident that continued research and advancements in this field hold significant potential for enhancing user experience and optimizing video streaming in social cloud environments.
REFERENCES
[1] A. A. Laghari, S. Mahar, M. Aslam, A. Seema, M. Shahbaz, and N. Ahmed, "Assessment of Quality of Experience (QoE) of Image Compression in Social Cloud Computing," Multiagent and Grid Systems, vol. 14, pp. 125-143, 2018.
[2] T. Kawano, K. Yamagishi, K. Watanabe, and J. Okamoto, "No reference video-quality-assessment model for video streaming services," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2285-2291, 2010.
[3] T. Kawano, K. Yamagishi, K. Watanabe, and J. Okamoto, "No reference video-quality-assessment model for video streaming services," in 2010 18th International Packet Video Workshop, pp. 1-8, IEEE, 2010.
[4] M. Barkowsky and E. Masala, "Objective video quality assessment-towards large scale video database enhanced model development," IEICE Transactions on Communications, vol. E98.B, no. 1, pp. 2-11, 2015.
[5] K. Brunnström, I. Sedano, K. Wang, K. Zhou, and S. Möller, "2D no-reference video quality model development and 3D video transmission quality," in Sixth International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM 2012), pp. 1-6, 2012.
[6] B. Ciubotaru, G. M. Muntean, and G. Ghinea, "Objective assessment of region of interest-aware adaptive multimedia streaming quality," IEEE Transactions on Broadcasting, vol. 55, no. 3, pp. 580-592, 2009.
[7] D. Varga, "No-Reference Video Quality Assessment Based on the Temporal Pooling of Deep Features," Neural Processing Letters, 2019. DOI: 10.1007/s11063-019-10036-6.
[8] D. Varga, P. Korshunov, L. Krasula, and F. Pereira, "No-Reference Video Quality Assessment Based on the Temporal Pooling of Deep Features," IEEE Access, 2019. DOI: 10.1109/ACCESS.2019.2901165.
[9] Z. Wang, H. R. Sheikh, and A. C. Bovik, "Objective Video Quality Metrics: A Performance Analysis," IEEE Transactions on Image Processing, vol. 20, no. 5, pp. 1185-1198, May 2011.
[10] C. Zhou, L. Zhang, X. Chen, and Y. Liu, "Objective Assessment of Region of Interest-Aware Adaptive Multimedia Streaming Quality," IEEE Transactions on Broadcasting, vol. 63, no. 3, pp. 523-536, Sep. 2017.
[11] M. Giannopoulos, G. Tsagkatakis, S. G. Blasi, and A. Mouchtaris, "Convolutional neural networks for video quality assessment," in 2018 International Workshop on Quality of Multimedia Experience (QoMEX), pp. 1-6, IEEE, 2018.
Copyright © 2023 Dhanashri Adhav, Abhijit Shinde, Nana Modhale, Shreyas Sukale, Prof. Manisha Darak. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET53100
Publish Date : 2023-05-26
ISSN : 2321-9653
Publisher Name : IJRASET